Chinese is written without using spaces or other word delimiters. Although a text may be
thought of as a corresponding sequence of words, there is considerable ambiguity in the 
placement of boundaries. Interpreting a text as a sequence of words is beneficial for some 
information retrieval and storage tasks: for example, full-text search, word-based
compression, and keyphrase extraction. We describe a scheme that infers appropriate 
positions for word boundaries using an adaptive language model that ... 
Ian H. Witten, Radford M. Neal, John G. Cleary

June 1987 Communications of the ACM, volume 30 issue 6 

Full text available: pdf(1.62 MB) Additional Information: full citation, abstract, references, citings, index
The state of the art in data compression is arithmetic coding, not the better-known Huffman
method. Arithmetic coding gives greater compression, is faster for adaptive models, and 
clearly separates the model from the channel encoding. 

6 Modeling word occurrences for the compression of concordances 
July 1997 ACM Transactions on Information Systems (TOIS), volume 15 issue 3

Full text available: pdf(630.99 KB) Additional Information: full citation, abstract, references, index terms, review

An earlier paper developed a procedure for compressing concordances, assuming that all 
elements occurred independently. The models introduced in that paper are extended here to
take the possibility of clustering into account. The concordance is conceptualized as a set of
bitmaps, in which the bit locations represent documents, and the one-bits represent the
occurrence of given terms. Hidden Markov Models (HMMs) are used to describe the
clustering of the one-bits. However, for computational ... 

Keywords: classification of graph nodes, concordance organization, concordance storage,
graph structure
Full text available: pdf(1.00 MB) Additional Information: full citation, abstract, references, index terms

Trellis coded quantization (TCQ) is incorporated into a noise feedback coding structure for 
encoding sampled speech. The effects of varying the encoding delay and the number of 
symbols released per trace-back on system performance and complexity are investigated. 

Keywords: Speech coding, trellis coded quantization, trellis coding 
Arithmetic coding revisited

Alistair Moffat, Radford M. Neal, Ian H. Witten 

July 1998 ACM Transactions on Information Systems (TOIS), volume 16 issue 3 

Full text available: pdf(487.26 KB) Additional Information: full citation, abstract, references, citings, index terms

Over the last decade, arithmetic coding has emerged as an important compression tool. It is 
now the method of choice for adaptive coding on multisymbol alphabets because of its
speed, low storage requirements, and effectiveness of compression. This article describes a 
new implementation of arithmetic coding that incorporates several improvements over a 
widely used earlier version by Witten, Neal, and Cleary, which has become a de facto
standard. These improvements include f ... 

Keywords: approximate coding, arithmetic coding, text compression, word-based model 
Debra A. Lelewer, Daniel S. Hirschberg

September 1987 ACM Computing Surveys (CSUR), Volume 19 issue 3 

Full text available: pdf(3.61 MB) Additional Information: full citation, abstract, references, citings, index terms, review

This paper surveys a variety of data compression methods spanning almost 40 years of 
research, from the work of Shannon, Fano, and Huffman in the late 1940s to a technique 
developed in 1986. The aim of data compression is to reduce redundancy in stored or 
communicated data, thus increasing effective data density. Data compression has important 
application in the areas of file storage and distributed systems. Concepts from information 
theory as they relate to the goals and evaluation of data ... 

11 A scheme for data compression in supercomputers 
M. A. Bassiouni, N. Ranganathan, A. Mukherjee 

November 1988 Proceedings of the 1988 ACM/IEEE conference on Supercomputing 

Full text available: pdf(627.38 KB) Additional Information: full citation, abstract, references, index terms

There is a growing recognition of the importance of efficient coding and data compression 
schemes in supercomputing centers and in networks of high-speed computing machines. 
Recently, there has been a considerable interest in arithmetic coding as a promising 
technique for reducing the cost of data storage and transmission. In this paper, we present 
a compression algorithm that is tailored to utilize the enormous speed and memory size of 
supercomputers and which utilizes an enhanced ... 

12 Comparative analysis of LISP and APL2 
A. Kaneko 

December 1987 ACM SIGAPL APL Quote Quad, Proceedings of the international
conference on APL, Volume 18 issue 2

Full text available: pdf(808.91 KB) Additional Information: full citation, abstract, references, index terms

LISP and APL2 were both born in 1960's and they have the similar syntax which is called 
function type. The main users of LISP were in the universities and the applications were 
linguistics or formula processing in mathematics. On the other hand APL was used in the 
companies and many of their applications were in the business environment such as 
business planning or reporting. Traditional APL provided user intuitive expressions, but it 
was rather weak in hand ... 

13 An adaptive dependency source model for data compression 
David M. Abrahamson 
February 1989 Communications of the ACM, volume 32 issue 1

Full text available: pdf(534.71 KB) Additional Information: full citation, abstract, references, citings, index terms, review

By dynamically recoding data on the basis of current intercharacter probabilities, the
entropy of encoded messages can be significantly reduced. 

14 Optimal prefetching via data compression 
Jeffrey Scott Vitter, P. Krishnan 

September 1996 Journal of the ACM (JACM), Volume 43 issue 5 

Full text available: pdf(564.53 KB) Additional Information: full citation, abstract, references, citings, index terms, review

Caching and prefetching are important mechanisms for speeding up access time to data on 
secondary storage. Recent work in competitive online algorithms has uncovered several 
promising new algorithms for caching. In this paper, we apply a form of the competitive 
philosophy for the first time to the problem of prefetching to develop an optimal universal 
prefetcher in terms of fault rate, with particular applications to large-scale databases and 
hypertext systems. Our prediction algorithms wit ... 

Keywords: Markov source, caching, competitive analysis, data compression, databases, 
fault rate, hypertext, prediction, prefetching, secondary stage, universal prefetcher 



15 Session P12: approximation and compression: Real-time decompression and 
visualization of animated volume data 
Stefan Guthe, Wolfgang Straßer

October 2001 Proceedings of the conference on Visualization '01 

Full text available: pdf(1.52 MB) Additional Information: full citation, abstract, references, citings, index terms

Interactive exploration of animated volume data is required by many application, but the 
huge amount of computational time and storage space needed for rendering does not allow 
the visualization of animated volumes by now. In this paper we introduce an algorithm 
running at interactive frame rates using 3d wavelet transforms that allows for any wavelet, 
motion compensation techniques and various encoding schemes of the resulting wavelet 
coefficients to be used. We analyze different families and o ... 

Keywords: compression for visualization, time critical visualization, volume rendering 



16 Data compression with finite windows 
E. R. Fiala, D. H. Greene 

April 1989 Communications of the ACM, volume 32 issue 4 

Full text available: pdf(1.89 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Several methods are presented for adaptive, invertible data compression in the style of 
Lempel and Ziv's first textual substitution proposal. For the first two methods, the article
describes modifications of McCreight's suffix tree data structure that support cyclic 
maintenance of a window on the most recent source characters. A percolating update is 
used to keep node positions within the window, and the updating process is shown to have 
constant amortized cost. Other methods explore the ... 
William R. Sanders, Gerard V. Benbassat, Robert L. Smith

February 1976 Proceedings of the ACM SIGCSE-SIGCUE technical symposium on
Computer science and education, volume 2, 8 issue si, 1

Full text available: pdf(1.03 MB) Additional Information: full citation, abstract, references, index terms

The Institute for Mathematical Studies in the Social Sciences at Stanford (IMSSS) has 
developed a synthesis system, MISS (Microprogrammed Intoned Speech Synthesizer), 
designed to test the effectiveness of computer-generated speech in the context of complex 
CAI programs. No one method of computer controlled speech production is completely 
satisfactory for all the uses of computer-assisted instruction (CAI). The choice of synthesis 
method is strongly related to the kinds of curriculums and in ... 

18 Coding polygon meshes as compressable ASCII 
Martin Isenburg, Jack Snoeyink 

February 2002 Proceeding of the seventh international conference on 3D Web 
technology 

Full text available: pdf(472.45 KB) Additional Information: full citation, abstract, references, citings, index terms

Because of the convenience of a text-based format 3D content is often published in form of 
a gzipped file that contains an ASCII description of the scene graph. While compressed 
image, audio, and video data is kept in separate binary files, polygonal data is usually
included uncompressed into the ASCII description, as there is no widely-accepted standard 
for compressed polygon meshes. In this paper we show how to incorporate compression of 
polygonal data into a purely text-based scene graph descr ... 

Keywords: ASCII scene descriptions, fast and extremely light-weight decoding, mesh 
compression, non-manifold mesh encoding 



19 On the containment and equivalence of database queries with linear constraints 
(extended abstract) 
Oscar H. Ibarra, Jianwen Su 

May 1997 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on 
Principles of database systems 

Full text available: pdf(1.70 MB) Additional Information: full citation, references, citings, index terms



20 Application of splay trees to data compression 
D. W. Jones 

August 1988 Communications of the ACM, Volume 31 issue 8 

Full text available: pdf(1.22 MB) Additional Information: full citation, abstract, references, citings, index

The splay-prefix algorithm is one of the simplest and fastest adaptive data compression 
algorithms based on the use of a prefix code. The data structures used in the splay-prefix 
algorithm can also be applied to arithmetic data compression. Applications of these 
algorithms to encryption and image processing are suggested. 